Abstract. In this paper we study, via equilibrium statistical mechanics, the properties of the 
internal energy of a Hopfield neural network whose patterns are stored continuously (Gaussian 
distributed). 

The model is shown to be equivalent to a bipartite spin glass in which one party is given by 
dichotomic neurons and the other party by Gaussian spin variables. Dealing with replicated 
systems, beyond the Mattis magnetization, we introduce two overlaps, one for each party, as 
order parameters of the theory: The first is a standard overlap among neural configurations on 
different replicas, the second is an overlap among the Gaussian spins of different replicas. 
The aim of this work is to show the existence of constraints for these order parameters close 
to those found in many other complex systems, such as spin glasses and diluted networks: we find 
a class of Ghirlanda-Guerra-like identities for both overlaps, generalizing the well-known 
constraints to neural networks, as well as new identities in which the noise is involved explicitly. 

1 Introduction 



Despite several recent advances in the statistical mechanics of complex systems that avoid 
the replica trick (see for instance [12] [22]), the original Amit-Gutfreund-Sompolinsky (AGS) 
theory [3][4][5] of associative neural networks still lacks a complete rigorous 
mathematical backbone in this sense. 

In fact, while the low storage case [3] has been largely understood and even 
some generalizations have been considered (see for instance [18] [19] [20] [21]), in the high 
storage limit (number of encoded memories diverging linearly with the number of working neurons) 
neither the existence of the thermodynamic limit (clearly understood for the paradigmatic 
Sherrington-Kirkpatrick model [16]) nor a complete description of the ergodic behavior 
of the system [9] has been obtained yet (even though whatever has been proved is 
in agreement with the AGS picture obtained via the replica trick). 

While working toward the critical line for ergodicity and a clear scenario 
for the replica symmetric regime, in this paper we investigate the existence, for these 
networks, of proper order parameter constraints, which are typical features of complex systems 
(see for instance p][l3][Tl] for discussions on linear constraints or [10] [11] for higher order 
constraints). 

In our framework, the analogical Hopfield neural network is thought of as a bipartite spin 
glass in suitably defined variables, in which the two parties interact with one another via the 
memory kernel. 

Consequently our constraints are satisfied by the following two overlaps: A standard 
overlap taking into account the similarity among replicas at the level of the neuronal 
configurations (first party), and an overlap weighting the similarity between the Gaus- 
sian spins (second party) among different replicas. 

Due to the symmetry of the interaction between the two parties, the constraints the overlaps 
obey are also symmetric with respect to their permutation: remarkably, the same 
symmetry was already found in the Random Overlap Structure framework [2], where, 
when analyzing the optimal structure [TS], roughly speaking, two mean field spin glass 
models were made to interact and identities formally equivalent to ours were found 

n- 

The relations for these two overlaps coupled together are obtained with standard 
techniques: by avoiding divergences in the response of the energy with respect to a change 
in the noise (which plays here the role of the temperature in material systems), and as 
consequences of the self-averaging of the internal energy. 

Furthermore we show that the internal energy of the system can be completely described 
in terms of a self-overlap among the Gaussian spins, which we prove to be self-averaging 
$\beta$-almost everywhere. 

The paper is structured as follows: in Section 2 the analogical neural network is 
introduced and its related statistical mechanics framework defined. 

Section 3 deals with a detailed study of the internal energy: it is expressed in several 
ways in terms of our overlaps, and the full self-averaging of the spin self-overlap is shown. 
Section 4 deals with its $\beta$-streaming evaluation as well as its self-averaging properties: 
the whole set of identities is proven in this section. 
Section 5 is left for outlook and conclusions. 



2 Definition of the neural network model 

The neural network model we use shares several features with the original AGS one: 
it is a mean field, fully connected network in which each neuron interacts with the 
whole neural community. Its memory kernel is stored in the synaptic matrix following 
the Hebb prescription [17] but, in contrast to the original AGS theory [I], 
our memory variables are not dichotomic bits: they share the same continuous support, 
weighted by a standard Gaussian distribution $\mathcal{N}[0,1]$. 

Concretely we introduce a large network of $N$ two-state neurons $\pm 1 \ni \sigma_i$, $i \in (1,\dots,N)$, 
which schematize the single neuronal dynamics [3] by matching the value $-1$ with a 
quiescent (or integrating) neuron and the value $+1$ with a spiking (or firing) neuron. 
They interact through the following synaptic matrix $J_{ij}$ (defined according to the Hebb 
rule for learning), 

$$ J_{ij} = \sum_{\mu=1}^{k} \xi_i^{\mu}\xi_j^{\mu}, \tag{2.1} $$

where each random variable $\xi^{\mu} = \{\xi_1^{\mu},\dots,\xi_N^{\mu}\}$ represents a pattern already 
stored by the network. 
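
As a minimal illustrative sketch (not part of the original analysis), the Hebbian matrix of eq. (2.1) can be built from Gaussian patterns as follows. The function names `sample_patterns` and `hebb_matrix`, the sizes and the fixed seed are assumptions made only for this example.

```python
import random

def sample_patterns(k, N, rng):
    """Draw k patterns of N i.i.d. standard Gaussian entries, stored as xi[mu][i]."""
    return [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(k)]

def hebb_matrix(xi):
    """Return J with J[i][j] = sum over mu of xi[mu][i] * xi[mu][j] (Hebb rule)."""
    N = len(xi[0])
    return [[sum(row[i] * row[j] for row in xi) for j in range(N)]
            for i in range(N)]

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
xi = sample_patterns(k=3, N=5, rng=rng)
J = hebb_matrix(xi)
```

By construction `J` is symmetric and its diagonal entries are sums of squares, so they are non-negative.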

As long as we deal with equilibrium properties we are only marginally concerned with the time 
scales involved in the dynamics, which however are postulated to live on at least three 
different time sectors. The spiking dynamics of each neuron, which happens on time 
scales much shorter than those involved in the propagation of spikes through the 
network, is thought of as effectively instantaneous (spin-flip). In complete opposition, the 
synaptic dynamics, where learning is stored into the memory kernel by updating the 
synaptic matrix, happens on time scales much slower than those involved 
in the propagation of spikes through the network; consequently the synaptic matrix is 
frozen at the beginning, so that no evolution of the memories is allowed. 
Between these time sectors lives the one for the equilibrium of the neural network, which 
is the object of our study. 

The analysis of the network assumes that the system has already memorized $k$ patterns 
(no learning is investigated), and we will be interested in the case in which this number 
increases proportionally (linearly) to the system size (high storage level). 
In the standard literature these patterns are usually taken at random with distribution 
$P(\xi_i^{\mu}) = (1/2)\delta_{\xi_i^{\mu},+1} + (1/2)\delta_{\xi_i^{\mu},-1}$, while we extend their support 
to the whole real axis, weighted by a Gaussian probability distribution, i.e. 

$$ P(\xi_i^{\mu}) = \frac{1}{\sqrt{2\pi}}\, e^{-(\xi_i^{\mu})^{2}/2}. \tag{2.2} $$

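Of course, avoiding pathological cases, in the high storage level and in the high temperature 
region, the results should be robust with respect to the particular choice of 
the probability distribution, and we should recover the standard AGS theory. 
The Hamiltonian of the model is defined as follows 

$$ H_N(\sigma;\xi) = -\frac{1}{N}\sum_{\mu=1}^{k}\sum_{i<j}^{N}\xi_i^{\mu}\xi_j^{\mu}\sigma_i\sigma_j, \tag{2.3} $$

which, splitting the summations as $\sum_{i<j}^{N} = \frac{1}{2}\sum_{i,j}^{N} - \frac{1}{2}\sum_{i,j}^{N}\delta_{ij}$, enables us to write down the 
following partition function 

$$ \begin{aligned}
Z_{N,k}(\beta;\xi) &= \sum_{\sigma}\exp\Big(\frac{\beta}{2N}\sum_{\mu=1}^{k}\sum_{i,j}^{N}\xi_i^{\mu}\xi_j^{\mu}\sigma_i\sigma_j - \frac{\beta}{2N}\sum_{\mu=1}^{k}\sum_{i}^{N}(\xi_i^{\mu})^{2}\Big) \\
&= \tilde{Z}(\beta;\xi)\,\exp\Big(-\frac{\beta}{2N}\sum_{\mu=1}^{k}\sum_{i=1}^{N}(\xi_i^{\mu})^{2}\Big),
\end{aligned} \tag{2.4} $$

where $\beta$, the inverse temperature in spin glass theory, denotes the level of noise in the 
network and we defined 

$$ \tilde{Z}(\beta;\xi) = \sum_{\sigma}\exp\Big(\frac{\beta}{2N}\sum_{\mu=1}^{k}\sum_{i,j}^{N}\xi_i^{\mu}\xi_j^{\mu}\sigma_i\sigma_j\Big). \tag{2.5} $$

Notice that the last term at the r.h.s. of eq. (2.4) does not depend on the particular 
state of the network. 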
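
The factorization in eq. (2.4) can be checked by brute force on a very small system. The sketch below is only an illustration under stated assumptions: the helper names (`hamiltonian`, `z_direct`, `z_tilde`, `diagonal_factor`), the sizes $N=4$, $k=2$, the value of $\beta$ and the seed are all hypothetical choices, and the $2^N$ enumeration is feasible only for tiny $N$.

```python
import itertools
import math
import random

def hamiltonian(sigma, xi):
    """H = -(1/N) * sum_mu sum_{i<j} xi_i xi_j sigma_i sigma_j."""
    N = len(sigma)
    total = 0.0
    for row in xi:
        for i in range(N):
            for j in range(i + 1, N):
                total += row[i] * row[j] * sigma[i] * sigma[j]
    return -total / N

def z_direct(beta, xi):
    """Partition function obtained by summing exp(-beta * H) over all states."""
    N = len(xi[0])
    return sum(math.exp(-beta * hamiltonian(s, xi))
               for s in itertools.product((-1, 1), repeat=N))

def z_tilde(beta, xi):
    """Sum over states of exp((beta/2N) * sum_mu (sum_i xi_i sigma_i)^2)."""
    N = len(xi[0])
    total = 0.0
    for s in itertools.product((-1, 1), repeat=N):
        quad = sum(sum(x * si for x, si in zip(row, s)) ** 2 for row in xi)
        total += math.exp(beta * quad / (2 * N))
    return total

def diagonal_factor(beta, xi):
    """State-independent factor exp(-(beta/2N) * sum_mu sum_i xi_i^2)."""
    N = len(xi[0])
    return math.exp(-beta / (2 * N) * sum(x * x for row in xi for x in row))

rng = random.Random(1)  # fixed seed, for reproducibility only
xi = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]
beta = 0.7
z_full = z_direct(beta, xi)
z_split = z_tilde(beta, xi) * diagonal_factor(beta, xi)
```

The two numbers `z_full` and `z_split` agree up to floating point rounding, which is why the state-independent factor can be dropped in what follows.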

Consequently we focus just on $\tilde{Z}(\beta;\xi)$. Let us apply the Hubbard-Stratonovich lemma 
to linearize with respect to the bilinear quenched memories carried by the $\xi_i^{\mu}\xi_j^{\mu}$: if we 
define the "Mattis magnetization" [3] as 

$$ m_{\mu} = \frac{1}{N}\sum_{i=1}^{N}\xi_i^{\mu}\sigma_i, \tag{2.6} $$

we can write 

$$ \tilde{Z}(\beta;\xi) = \sum_{\sigma}\int\prod_{\mu=1}^{k}\Big(\frac{dz_{\mu}\,e^{-z_{\mu}^{2}/2}}{\sqrt{2\pi}}\Big)\exp\Big(\sqrt{\beta N}\sum_{\mu=1}^{k} m_{\mu}z_{\mu}\Big). \tag{2.7} $$



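In what follows, the following partition function, which implicitly defines an effective 
Hamiltonian, will be used: 

$$ Z_{N,k}(\beta;\xi) = \sum_{\sigma}\int\prod_{\mu=1}^{k} d\mu(z_{\mu})\exp\Big(\sqrt{\frac{\beta}{N}}\sum_{\mu,i}\xi_i^{\mu}\sigma_i z_{\mu}\Big), \tag{2.8} $$

where $d\mu(z_{\mu})$ is the Gaussian measure. 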
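
The Hubbard-Stratonovich step, i.e. the equality between the quadratic form of eq. (2.5) and the bipartite form of eq. (2.8), can be verified numerically for a tiny network. This is a hedged sketch: the Gaussian integrals are done with a plain trapezoid rule, and every name here (`gaussian_average_exp`, `z_quadratic`, `z_bipartite`), together with the sizes, the integration window and the seed, is an assumption introduced for the example.

```python
import itertools
import math
import random

def gaussian_average_exp(c, half_width=10.0, steps=4000):
    """Trapezoid estimate of the integral of exp(-z^2/2 + c*z) / sqrt(2*pi) over z.

    The window is centred on z = c, where the integrand is peaked.
    """
    lo = c - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps + 1):
        z = lo + n * h
        w = 0.5 if n in (0, steps) else 1.0
        total += w * math.exp(-0.5 * z * z + c * z)
    return total * h / math.sqrt(2.0 * math.pi)

def z_quadratic(beta, xi):
    """Sum over sigma of exp((beta/2N) * sum_mu (sum_i xi_i sigma_i)^2)."""
    N = len(xi[0])
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=N):
        quad = sum(sum(x * s for x, s in zip(row, sigma)) ** 2 for row in xi)
        total += math.exp(beta * quad / (2 * N))
    return total

def z_bipartite(beta, xi):
    """Sum over sigma of the product over mu of Gaussian integrals in z_mu."""
    N = len(xi[0])
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=N):
        term = 1.0
        for row in xi:
            c = math.sqrt(beta / N) * sum(x * s for x, s in zip(row, sigma))
            term *= gaussian_average_exp(c)
        total += term
    return total

rng = random.Random(1)  # fixed seed, for reproducibility only
xi = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]
beta = 0.5
lhs = z_quadratic(beta, xi)
rhs = z_bipartite(beta, xi)
```

Each one-dimensional integral reproduces $\exp(c^2/2)$, so `lhs` and `rhs` coincide up to quadrature and rounding error.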

Note that, as we have mapped the neural network problem into a spin glass problem, 
the normalization factor of the effective Hamiltonian has changed accordingly. 
In fact, in the high storage case, this structure clearly reflects the interaction among the 
$N$ dichotomic spins $\sigma$ and the $k$ Gaussian variables $z$ through the random interaction 
matrix encoded by the patterns (in the low storage case, where $N$ goes to 
infinity but $k$ remains finite, such an equivalence breaks down). 

Reflecting this "bipartite" nature of the Hopfield model expressed by eq. (2.8), we 
introduce two other order parameters beyond the "Mattis magnetization" (eq. (2.6)): the 
first is the standard overlap between the replicated neurons, defined as 

$$ q_{ab} = \frac{1}{N}\sum_{i=1}^{N}\sigma_i^{a}\sigma_i^{b}, \tag{2.9} $$

and the second is the overlap between the spins of different replicas, defined as 

$$ p_{ab} = \frac{1}{k}\sum_{\mu=1}^{k} z_{\mu}^{a}z_{\mu}^{b}. \tag{2.10} $$

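Taking $F$ as a generic function of the neurons, we define the Boltzmann state $\omega_{\beta}(F)$ at a 
given level of noise $\beta$ as 

$$ \omega_{\beta}(F) = \frac{1}{Z_{N,k}(\beta;\xi)}\sum_{\sigma}\int\prod_{\mu=1}^{k} d\mu(z_{\mu})\,F\,\exp\Big(\sqrt{\frac{\beta}{N}}\sum_{\mu,i}\xi_i^{\mu}\sigma_i z_{\mu}\Big), \tag{2.11} $$

and often we will drop the subscript $\beta$ for the sake of simplicity. The $s$-replicated 
Boltzmann measure is defined as $\Omega = \omega^{1}\times\omega^{2}\times\dots\times\omega^{s}$, in which all the single Boltzmann states 
are independent states at the same noise level $\beta^{-1}$ and share an identical distribution of 
quenched memories $\xi$. 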

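The average over the quenched memories will be denoted by $E$ and, for a generic function 
of these memories $F(\xi)$, can be written as 

$$ E[F(\xi)] = \int\prod_{\mu=1}^{k}\prod_{i=1}^{N}\frac{d\xi_i^{\mu}\,e^{-(\xi_i^{\mu})^{2}/2}}{\sqrt{2\pi}}\,F(\xi) = \int F(\xi)\,d\mu(\xi); \tag{2.12} $$

of course $E[\xi_i^{\mu}] = 0$ and $E[(\xi_i^{\mu})^{2}] = 1$. 

We use the symbol $\langle\cdot\rangle$ to mean $\langle\cdot\rangle = E\,\Omega(\cdot)$. 

In the thermodynamic limit, it is assumed 

$$ \lim_{N\to\infty}\frac{k}{N} = \alpha, $$

$\alpha$ being a given real number, parameter of the theory. 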

The standard quantity of interest is the intensive quenched pressure, defined as 

$$ A_{N,k}(\beta) = -\beta f_{N,k}(\beta) = \frac{1}{N}E\ln Z_{N,k}(\beta;\xi), \tag{2.13} $$

where $f_{N,k}(\beta) = u_{N,k}(\beta) - \beta^{-1}s_{N,k}(\beta)$ is the free energy density, $u_{N,k}(\beta)$ the internal 
energy density and $s_{N,k}(\beta)$ the intensive entropy. 

Assuming that the thermodynamic limits of the free energy and the internal energy exist, 
these quantities will be denoted as $A(\alpha,\beta)$, $u(\alpha,\beta)$. 



3 Properties of the internal energy 



In this section we study the properties of the internal energy: first we evaluate it 
explicitly in terms of our overlaps. After showing that it can be expressed via the spin 
self-overlap $\langle p_{11}\rangle$, we also prove that $p_{11}$ is completely self-averaging. 
Let us start with the following theorem. 



Theorem 3.1 The following expressions for the internal energy density hold in the 
thermodynamic limit 

$$ u(\alpha,\beta) = \frac{\alpha}{2(1-\beta)}\big(1-\langle q_{12}p_{12}\rangle\big), \tag{3.1} $$
$$ u(\alpha,\beta) = \frac{\alpha}{2\beta}\big(\langle p_{11}\rangle - 1\big). \tag{3.2} $$

Proof 

The proof, like several others throughout the paper, uses direct calculation and the Wick 
theorem (see eq. (3.3)), as we deal with Gaussian distributed variables like the spins and 
the memories. 

In fact we recall that for these quantities, considering $f(\xi)$ as a generic well behaved 
function of the memories, the following relation (integration by parts) holds: 

$$ E\,\xi f(\xi) = E\,\partial_{\xi}f(\xi). \tag{3.3} $$
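
The integration-by-parts relation (3.3) can be checked numerically for a single standard Gaussian variable. The following is a minimal sketch under assumptions: the test function $f(\xi) = \tanh(a\xi)$, the value of `a`, and the helper name `gaussian_expectation` are all illustrative choices, and the expectation is approximated by a trapezoid rule on a truncated interval.

```python
import math

def gaussian_expectation(g, half_width=12.0, steps=24000):
    """Trapezoid estimate of E[g(xi)] for xi distributed as N(0, 1)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps + 1):
        x = -half_width + n * h
        w = 0.5 if n in (0, steps) else 1.0
        total += w * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

a = 0.8  # arbitrary slope of the illustrative test function

def f(x):
    return math.tanh(a * x)

def df(x):
    # derivative of tanh(a*x) with respect to x
    return a / math.cosh(a * x) ** 2

lhs = gaussian_expectation(lambda x: x * f(x))  # E[xi f(xi)]
rhs = gaussian_expectation(df)                  # E[f'(xi)]
```

Within quadrature accuracy `lhs` and `rhs` coincide, which is the one-variable version of the relation used repeatedly in the proofs.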

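So we can write 

$$ \begin{aligned}
\frac{\langle H_{N,k}(\sigma;\xi)\rangle}{N} &= \partial_{\beta}\frac{1}{N}E\log\sum_{\sigma}\int\prod_{\mu=1}^{k}d\mu(z_{\mu})\exp\Big(\sqrt{\frac{\beta}{N}}\sum_{\mu,i}\xi_i^{\mu}\sigma_i z_{\mu}\Big) \\
&= \frac{1}{2N\sqrt{\beta N}}\sum_{\mu,i}E\,\xi_i^{\mu}\,\omega(\sigma_i z_{\mu}) = \frac{1}{2N\sqrt{\beta N}}\sum_{\mu,i}E\,\partial_{\xi_i^{\mu}}\omega(\sigma_i z_{\mu}) \\
&= \frac{1}{2N^{2}}\sum_{\mu,i}E\big(\omega(z_{\mu}^{2}) - \omega(\sigma_i z_{\mu})\,\omega(\sigma_i z_{\mu})\big),
\end{aligned} \tag{3.4} $$

which in the thermodynamic limit becomes 

$$ = \frac{\alpha}{2}\big(\langle p_{11}\rangle - \langle p_{12}q_{12}\rangle\big). \tag{3.5} $$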

Now it is enough to show that 

$$ (1-\beta)\langle p_{11}\rangle + \beta\langle p_{12}q_{12}\rangle = 1, \tag{3.6} $$

and the proof is complete. 
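This can be achieved as follows: 

$$ \begin{aligned}
\langle p_{11}\rangle &= \frac{1}{k}\sum_{\mu}E\,\omega(z_{\mu}^{2}) = \frac{1}{k}\sum_{\mu}E\,\frac{1}{Z_{N,k}(\beta;\xi)}\sum_{\sigma}\int\prod_{\nu=1}^{k}\frac{dz_{\nu}\,e^{-z_{\nu}^{2}/2}}{\sqrt{2\pi}}\;z_{\mu}^{2}\,e^{\sqrt{\beta/N}\sum_{\nu,i}\xi_i^{\nu}\sigma_i z_{\nu}} \\
&= \frac{1}{k}\sum_{\mu}E\,\frac{1}{Z_{N,k}(\beta;\xi)}\sum_{\sigma}\int\prod_{\nu=1}^{k}\frac{dz_{\nu}\,e^{-z_{\nu}^{2}/2}}{\sqrt{2\pi}}\;\partial_{z_{\mu}}\Big(z_{\mu}\,e^{\sqrt{\beta/N}\sum_{\nu,i}\xi_i^{\nu}\sigma_i z_{\nu}}\Big) \\
&= 1 + \frac{1}{k}\sum_{\mu}\sqrt{\frac{\beta}{N}}\sum_{i}E\,\xi_i^{\mu}\,\omega(\sigma_i z_{\mu}) \\
&= 1 + \beta\langle p_{11}\rangle - \beta\langle q_{12}p_{12}\rangle,
\end{aligned} \tag{3.7} $$

and the thesis is proven. □ 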
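
For completeness, the elementary algebra linking the thermodynamic-limit expression $\frac{\alpha}{2}(\langle p_{11}\rangle - \langle p_{12}q_{12}\rangle)$ of eq. (3.5) to the two forms stated in Theorem 3.1 can be spelled out. This is only a worked rearrangement of the constraint $(1-\beta)\langle p_{11}\rangle + \beta\langle p_{12}q_{12}\rangle = 1$ of eq. (3.6), solved once for each overlap; no additional assumption enters.

```latex
\begin{align*}
\langle p_{12}q_{12}\rangle &= \frac{1-(1-\beta)\langle p_{11}\rangle}{\beta},
&
\langle p_{11}\rangle-\langle p_{12}q_{12}\rangle
&= \frac{\langle p_{11}\rangle-1}{\beta},\\
\langle p_{11}\rangle &= \frac{1-\beta\langle p_{12}q_{12}\rangle}{1-\beta},
&
\langle p_{11}\rangle-\langle p_{12}q_{12}\rangle
&= \frac{1-\langle p_{12}q_{12}\rangle}{1-\beta}.
\end{align*}
```

Multiplying the two right-hand columns by $\alpha/2$ gives the prefactors $\alpha/(2\beta)$ and $\alpha/(2(1-\beta))$ of eqs. (3.2) and (3.1), respectively.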



As we saw in eq. (3.2), we can express the internal energy via $\langle p_{11}\rangle$. The following 
theorem is therefore important. 



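Theorem 3.2 In the thermodynamic limit, $\beta$ almost everywhere, the self-overlap $p_{11}$ 
completely self-averages: 

$$ \lim_{N\to\infty}E\,\omega^{2}\Big(\frac{1}{k}\sum_{\mu}z_{\mu}^{2}\Big) = \lim_{N\to\infty}\Big(E\,\omega\Big(\frac{1}{k}\sum_{\mu}z_{\mu}^{2}\Big)\Big)^{2} = \lim_{N\to\infty}E\,\omega\Big(\frac{1}{k^{2}}\sum_{\mu,\nu}z_{\mu}^{2}z_{\nu}^{2}\Big), $$

i.e. 

$$ \langle p_{11}p_{22}\rangle = \langle p_{11}\rangle^{2} = \langle p_{11}^{2}\rangle. \tag{3.8} $$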



Proof 

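The proof works by direct calculation and is split into two steps, the former 
linking the first two terms of eq. (3.8), the latter linking the second with the last. 
By looking at the self-averaging of the internal energy, 

$$ \lim_{N\to\infty}\big\langle\big(u_{N,k}(\beta) - \langle u_{N,k}(\beta)\rangle\big)^{2}\big\rangle = 0, $$

we show that $E\,\omega^{2}(p_{11}) = (E\,\omega(p_{11}))^{2}$ or, in terms of overlaps, $\langle p_{11}\rangle^{2} = \langle p_{11}p_{22}\rangle$. 
Squaring both sides of eq. (3.2) we get 

$$ \lim_{N\to\infty}\big(E\,\omega(u_{N,k}(\beta))\big)^{2} = \frac{\alpha^{2}}{4\beta^{2}}\big(\langle p_{11}\rangle^{2} - 2\langle p_{11}\rangle + 1\big). \tag{3.9} $$

Now we must evaluate the term involving $E\,\omega^{2}(p_{11})$: 

$$ \lim_{N\to\infty}E\,\omega^{2}(u_{N,k}(\beta)) = \frac{\alpha^{2}}{4\beta^{2}}\,E\big(\omega(p_{11}) - 1\big)\big(\omega(p_{11}) - 1\big) = \frac{\alpha^{2}}{4\beta^{2}}\big(\langle p_{11}p_{22}\rangle - 2\langle p_{11}\rangle + 1\big). \tag{3.10} $$

Subtracting eq. (3.10) from eq. (3.9) we obtain the first part of eq. (3.8). 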

To obtain the missing relation, i.e. $\langle p_{11}^{2}\rangle = \langle p_{11}p_{22}\rangle$, we must work out the $\beta$-derivative of 
the internal energy. It will involve a polynomial in the overlaps multiplied by a factor $k$. 
By avoiding its $k$-divergence (as we are in the high storage case, when $N\to\infty$ 
also $k\to\infty$, linearly with $N$) we obtain the other relation. 



dpipu) 



E 



dp 



Ea / eX P( V N Ei M 



(3.11) 



2ky/Np 



iv 



2ky/(3N 



2k^]3N 



where 



@ . .r 2 \, i \ ft 




^(zj)w 2 («7i^). 



(3.12) 



Inserting eq. (3.12) into eq. (3.11) we get 



which gives 



+ uj(zl)u 2 (aiZ u ) + u(zl)u(zl) + u(zl)u 2 (aiZ u ) 



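$$ \partial_{\beta}\langle p_{11}\rangle = \frac{k}{2}\big(\langle p_{11}^{2}\rangle - \langle p_{11}p_{22}\rangle\big). \tag{3.13} $$

As we are in the high stored pattern limit ($k\to\infty$), in the thermodynamic limit we get 
$\langle p_{11}^{2}\rangle = \langle p_{11}p_{22}\rangle$, and the proof is complete. □ 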

Let us call $p(\beta)$ the value taken by all overlaps $p_{aa}$ in the infinite volume limit, and 
denote by $\eta_{aa}$ the rescaled fluctuations 

$$ \eta_{aa} = \sqrt{k}\,(p_{aa} - p). \tag{3.14} $$

Then we have the following. 



Corollary 3.3 In the ergodic regime, defined by the line $\beta = 1/(1+\sqrt{\alpha})$, where the 
intensive free energy is given by $A(\alpha,\beta) = \ln 2 - \frac{\alpha}{2}\ln(1-\beta)$, the value of the 
overlap $p$ and the $k$-rescaled fluctuations have the following behavior 

$$ p(\beta) = \frac{1}{1-\beta}, \tag{3.15} $$
$$ \langle\eta_{11}^{2}\rangle = \frac{2}{(1-\beta)^{2}}. \tag{3.16} $$



Proof 

From the relation $A(\alpha,\beta) = \ln 2 - (\alpha/2)\ln(1-\beta)$ we get 

$$ \frac{\partial A(\alpha,\beta)}{\partial\beta} = \frac{\alpha}{2}\,\frac{1}{1-\beta} = \frac{\alpha}{2\beta}\,(p-1), \tag{3.17} $$

from which we immediately get eq. (3.15). 
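Then we write 

$$ \partial_{\beta}p = \frac{1}{(1-\beta)^{2}} = \frac{k}{2}\big(\langle p_{11}^{2}\rangle - \langle p_{11}p_{22}\rangle\big), \tag{3.18} $$

by which we immediately get 

$$ \frac{1}{(1-\beta)^{2}} = \frac{1}{2}\big(\langle\eta_{11}^{2}\rangle - \langle\eta_{11}\eta_{22}\rangle\big). \tag{3.19} $$

Now, noticing that, at least in the ergodic region, in the thermodynamic limit $\langle\eta_{11}\eta_{22}\rangle \to 0$, 
we get the result. □ 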

Note that this corollary automatically implies $\langle q_{12}p_{12}\rangle = 0$ in the ergodic regime, as 
it should be. 
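
The chain of equalities used in the corollary can be checked numerically at a single point of the ergodic region. The sketch below is illustrative only: the point $(\alpha,\beta) = (0.5, 0.3)$, the finite-difference step and all function names are assumptions made for this example, and the derivatives are approximated by central differences.

```python
import math

def pressure(alpha, beta):
    """Ergodic-regime expression A(alpha, beta) = ln 2 - (alpha/2) ln(1 - beta)."""
    return math.log(2.0) - 0.5 * alpha * math.log(1.0 - beta)

def in_ergodic_region(alpha, beta):
    """True when 0 < beta < 1 / (1 + sqrt(alpha))."""
    return 0.0 < beta < 1.0 / (1.0 + math.sqrt(alpha))

def p_of_beta(beta):
    """Self-overlap value p(beta) = 1 / (1 - beta)."""
    return 1.0 / (1.0 - beta)

def central_difference(func, x, h=1e-6):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

alpha, beta = 0.5, 0.3  # arbitrary point inside the ergodic region
dA = central_difference(lambda b: pressure(alpha, b), beta)
mid_317 = 0.5 * alpha / (1.0 - beta)                      # (alpha/2) / (1 - beta)
rhs_317 = alpha / (2.0 * beta) * (p_of_beta(beta) - 1.0)  # (alpha/2beta)(p - 1)
dp = central_difference(p_of_beta, beta)
eta11_sq = 2.0 / (1.0 - beta) ** 2                        # value stated for <eta_11^2>
```

Here `dA`, `mid_317` and `rhs_317` agree, and `dp` equals `eta11_sq / 2`, which is the content of the corollary once $\langle\eta_{11}\eta_{22}\rangle$ is neglected.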



4 Constraints 



Now we turn to the constraints. Starting with the linear identities, we state the following. 

Proposition 4.1 In the thermodynamic limit, and $\beta$ almost everywhere, the following 
generalization of the linear overlap constraints holds for the analogical neural network 

$$ \langle q_{12}^{2}p_{12}^{2}\rangle - 4\langle q_{12}p_{12}q_{23}p_{23}\rangle + 3\langle q_{12}p_{12}q_{34}p_{34}\rangle = 0. \tag{4.1} $$

Proof 

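Let us address our task by looking at the $\beta$-streaming of the internal energy density, once 
expressed via $\langle q_{12}p_{12}\rangle$: 

$$ \begin{aligned}
\partial_{\beta}\langle q_{12}p_{12}\rangle &= \frac{1}{Nk}\sum_{\mu,i}E\,\partial_{\beta}\,\omega^{2}(z_{\mu}\sigma_i) = \frac{1}{Nk}\sum_{\mu,i}E\,2\,\omega(z_{\mu}\sigma_i)\,\partial_{\beta}\,\omega(z_{\mu}\sigma_i) && (4.2) \\
&= \frac{2}{Nk}\sum_{\mu,i}E\,\omega(z_{\mu}\sigma_i)\sum_{\nu,j}\frac{\xi_j^{\nu}}{2\sqrt{\beta N}}\big(\omega(z_{\mu}\sigma_i z_{\nu}\sigma_j) - \omega(z_{\mu}\sigma_i)\,\omega(z_{\nu}\sigma_j)\big); && (4.3)
\end{aligned} $$

now we use the Wick theorem on $\xi$ to get 

$$ \begin{aligned}
\partial_{\beta}\langle q_{12}p_{12}\rangle = \frac{1}{N^{2}k}\sum_{\mu,i}\sum_{\nu,j}E\Big[
&\big(\omega(z_{\mu}\sigma_i z_{\nu}\sigma_j) - \omega(z_{\mu}\sigma_i)\omega(z_{\nu}\sigma_j)\big)\big(\omega(z_{\mu}\sigma_i z_{\nu}\sigma_j) - \omega(z_{\mu}\sigma_i)\omega(z_{\nu}\sigma_j)\big) \\
&+ \omega(z_{\mu}\sigma_i)\Big\{\omega(z_{\mu}\sigma_i\sigma_j z_{\nu}z_{\nu}\sigma_j) - \omega(z_{\mu}\sigma_i z_{\nu}\sigma_j)\,\omega(z_{\nu}\sigma_j) \\
&\quad - \omega(z_{\mu}\sigma_i z_{\nu}\sigma_j)\,\omega(z_{\nu}\sigma_j) + \omega(z_{\mu}\sigma_i)\,\omega(z_{\nu}\sigma_j)\,\omega(z_{\nu}\sigma_j) \\
&\quad - \omega(z_{\mu}\sigma_i)\,\omega(z_{\nu}\sigma_j z_{\nu}\sigma_j) + \omega(z_{\mu}\sigma_i)\,\omega(z_{\nu}\sigma_j)\,\omega(z_{\nu}\sigma_j)\Big\}\Big].
\end{aligned} $$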



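Introducing the overlaps we have 

$$ \begin{aligned}
\partial_{\beta}\langle q_{12}p_{12}\rangle = k\big(&\langle p_{12}^{2}q_{12}^{2}\rangle - \langle p_{12}q_{12}p_{13}q_{13}\rangle - \langle p_{12}q_{12}p_{13}q_{13}\rangle + \langle p_{12}q_{12}p_{34}q_{34}\rangle + p\,\langle q_{12}p_{12}\rangle \\
&- \langle p_{12}q_{12}p_{13}q_{13}\rangle - \langle p_{12}q_{12}p_{13}q_{13}\rangle + \langle p_{12}q_{12}p_{34}q_{34}\rangle - p\,\langle q_{12}p_{12}\rangle + \langle p_{12}q_{12}p_{34}q_{34}\rangle\big).
\end{aligned} \tag{4.4} $$

The several cancellations leave the following remaining terms 

$$ \partial_{\beta}\langle q_{12}p_{12}\rangle = k\big(\langle q_{12}^{2}p_{12}^{2}\rangle - 4\langle q_{12}p_{12}q_{23}p_{23}\rangle + 3\langle q_{12}p_{12}q_{34}p_{34}\rangle\big) \tag{4.5} $$

and, again in the thermodynamic limit, in the high storage case, the thesis is proved. □ 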
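
The cancellations leading from eq. (4.4) to eq. (4.5) are pure bookkeeping and can be tallied mechanically. The sketch below is only an illustration: the ten signed terms are transcribed by hand from eq. (4.4), the string labels are arbitrary, and the overlaps $q_{13}p_{13}$ and $q_{23}p_{23}$ are identified because they differ only by a relabelling of replicas.

```python
from collections import Counter

# (coefficient, monomial label) for the ten terms of eq. (4.4), in order.
TERMS_4_4 = [
    (+1, "q12^2 p12^2"),
    (-1, "q12 p12 q13 p13"),
    (-1, "q12 p12 q13 p13"),
    (+1, "q12 p12 q34 p34"),
    (+1, "p q12 p12"),
    (-1, "q12 p12 q13 p13"),
    (-1, "q12 p12 q13 p13"),
    (+1, "q12 p12 q34 p34"),
    (-1, "p q12 p12"),
    (+1, "q12 p12 q34 p34"),
]

def collect(terms):
    """Sum the coefficients of equal monomials and drop those that cancel."""
    totals = Counter()
    for coeff, label in terms:
        totals[label] += coeff
    return {label: c for label, c in totals.items() if c != 0}

collected = collect(TERMS_4_4)
```

The surviving coefficients are $1$, $-4$ and $3$, while the two terms proportional to $p\,\langle q_{12}p_{12}\rangle$ cancel.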



Theorem 4.2 In the thermodynamic limit, for almost all values of $\beta$, the following 
generalization of the quadratic Ghirlanda-Guerra relations holds for the analogical neural 
network 

$$ \langle q_{12}p_{12}q_{23}p_{23}\rangle = \frac{1}{2}\langle q_{12}^{2}p_{12}^{2}\rangle + \frac{1}{2}\langle q_{12}p_{12}\rangle^{2}, \tag{4.6} $$
$$ \langle q_{12}p_{12}q_{34}p_{34}\rangle = \frac{1}{3}\langle q_{12}^{2}p_{12}^{2}\rangle + \frac{2}{3}\langle q_{12}p_{12}\rangle^{2}. \tag{4.7} $$

Proof 

Starting from 

u,j 

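with a calculation perfectly analogous to the one performed in the proof of Proposition 
4.1 we obtain the following expression 

$$ \lim_{N\to\infty}E\big(u_{N}^{2}(\beta)\big) = \frac{\alpha^{2}}{4}\Big(\big\langle(p - q_{12}p_{12})^{2}\big\rangle + 6\langle q_{12}p_{12}q_{34}p_{34}\rangle - 6\langle q_{12}p_{12}q_{23}p_{23}\rangle\Big), \tag{4.8} $$

which must be compared with the square of the r.h.s. of eq. (3.5), that is equal to 

$$ \lim_{N\to\infty}\langle u_{N}(\beta)\rangle^{2} = \frac{\alpha^{2}}{4}\big(p^{2} - 2p\,\langle q_{12}p_{12}\rangle + \langle q_{12}p_{12}\rangle^{2}\big). \tag{4.9} $$

As a consequence, subtracting eq. (4.9) from eq. (4.8) and taking into account also eq. (4.1) 
(which we rewrite for simplicity) we get the linear system 

$$ 0 = \langle q_{12}^{2}p_{12}^{2}\rangle + 6\langle q_{12}p_{12}q_{34}p_{34}\rangle - 6\langle q_{12}p_{12}q_{23}p_{23}\rangle - \langle q_{12}p_{12}\rangle^{2}, \tag{4.10} $$
$$ 0 = \langle q_{12}^{2}p_{12}^{2}\rangle - 4\langle q_{12}p_{12}q_{23}p_{23}\rangle + 3\langle q_{12}p_{12}q_{34}p_{34}\rangle, \tag{4.11} $$

whose solution gives exactly the expressions reported in Theorem 4.2. □ 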
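
The last step, solving the two linear relations (4.10) and (4.11) for the two four-replica overlaps, can be reproduced with exact rational arithmetic. In this sketch the shorthand is an assumption of the example: `Y` stands for $\langle q_{12}p_{12}q_{23}p_{23}\rangle$, `W` for $\langle q_{12}p_{12}q_{34}p_{34}\rangle$, and each right-hand side is stored as a pair (coefficient of $X = \langle q_{12}^2p_{12}^2\rangle$, coefficient of $M = \langle q_{12}p_{12}\rangle^2$).

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*Y + a12*W = b1, a21*Y + a22*W = b2 by Cramer's rule.

    b1 and b2 are pairs (coefficient of X, coefficient of M); the returned
    Y and W are pairs of the same kind, with exact Fraction entries.
    """
    det = Fraction(a11 * a22 - a12 * a21)

    def combine(c1, v1, c2, v2):
        return tuple((c1 * x + c2 * y) / det for x, y in zip(v1, v2))

    Y = combine(a22, b1, -a12, b2)
    W = combine(-a21, b1, a11, b2)
    return Y, W

# (4.10): 0 = X + 6W - 6Y - M   ->   -6Y + 6W = -X + M
# (4.11): 0 = X - 4Y + 3W       ->   -4Y + 3W = -X
Y, W = solve_2x2(-6, 6, -4, 3, (-1, 1), (-1, 0))
```

The result is `Y = (1/2, 1/2)` and `W = (1/3, 2/3)`, i.e. the coefficients appearing in eqs. (4.6) and (4.7).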



Theorem 4.3 For the analogical neural network, a new class of identities, which involve an 
explicit dependence on the noise of the network, holds in the thermodynamic limit; 
examples are 

$$ 1 = (1-\beta)\,p + \beta\langle q_{12}p_{12}\rangle, \tag{4.12} $$
$$ 0 = (1 + \beta p - p)\langle q_{12}^{2}\rangle - 2\beta\langle q_{12}p_{12}q_{13}^{2}\rangle + \beta\langle q_{12}^{3}p_{12}\rangle. \tag{4.13} $$



Proof 

The proof of eq. (4.12) is simply the explicit calculation of the quantity $\langle p_{11}\rangle = E\,\omega(k^{-1}\sum_{\mu}z_{\mu}^{2})$, 
as established in the derivation of eq. (3.7), recalling that in the thermodynamic limit, 
by Theorem 3.2, $\langle p_{11}\rangle = p$. 

The proof of eq. (4.13) works exactly along the lines of the proof of eq. (4.12), by simply working 
out explicitly the term $E\,\Omega(k^{-1}\sum_{\mu}z_{\mu}^{2}\,q_{12}^{2})$ (and so on for higher order relations). □ 

5 Summary 



In this paper we analyzed the properties of the internal energy of an analogical neural 
network. 

First we mapped the problem into a bipartite spin glass and evaluated its internal 
energy by introducing two order parameters suited to the task: a standard spin glass 
overlap comparing neural configurations (first party) on different replicas and an overlap 
among the Gaussian spins (second party) on different replicas. 

We showed that the internal energy density can be expressed via the spin self-overlap 
and proved its full self-averaging. 

Furthermore, for these overlaps, we investigated the presence of constraints, finding both 
the linear and the quadratic identities, as expected, since the analogical neural network 
is a well-known complex system. 

These constraints appear with a clear symmetric structure with respect to the two 
overlaps, which interact together in both families. Ultimately, this symmetry reflects the 
bipartite nature of the neural network, in which the interaction among the $k$ Gaussian spins 
and the $N$ dichotomic variables is encoded in the memory patterns $\xi$. 
Future work will be directed toward the analysis of the critical line for ergodicity, 
extending previous results up to that line [H], and toward the study of the still rather obscure 
(at the mathematical level) retrieval of the replica symmetric regime. 